35 research outputs found

    Personalizing Session-based Recommendations with Hierarchical Recurrent Neural Networks

    Session-based recommendations are highly relevant in many modern on-line services (e.g. e-commerce, video streaming) and recommendation settings. Recently, Recurrent Neural Networks have been shown to perform very well in session-based settings. While in many session-based recommendation domains user identifiers are hard to come by, there are also domains in which user profiles are readily available. We propose a seamless way to personalize RNN models with cross-session information transfer and devise a Hierarchical RNN model that relays and evolves latent hidden states of the RNNs across user sessions. Results on two industry datasets show large improvements over the session-only RNNs.

    Cross-domain recommendations without overlapping data: Myth or reality?

    Cross-domain recommender systems adopt different techniques to transfer learning from a source domain to a target domain in order to alleviate the sparsity problem and improve the accuracy of recommendations. Traditional techniques require the two domains to be linked by shared characteristics associated with either users or items. In collaborative filtering (CF) this happens when the two domains have overlapping users or items (at least partially). Recently, Li et al. [7] introduced codebook transfer (CBT), a cross-domain CF technique based on co-clustering, and presented experimental results showing that CBT is able to transfer knowledge between non-overlapping domains. In this paper, we disprove these results and show that CBT does not transfer knowledge when source and target domains do not overlap.

    An efficient closed frequent itemset miner for the MOA stream mining system

    Mining itemsets is a central task in data mining, both in the batch and the streaming paradigms. While robust, efficient, and well-tested implementations exist for batch mining, hardly any publicly available equivalent exists for the streaming scenario. The lack of an efficient, usable tool for the task hinders its use by practitioners and makes it difficult to assess new research in the area. To alleviate this situation, we review the algorithms described in the literature, and implement and evaluate the IncMine algorithm by Cheng, Ke, and Ng (2008) for mining frequent closed itemsets from data streams. Our implementation works on top of the MOA (Massive Online Analysis) stream mining framework to ease its use and integration with other stream mining tasks. We provide a PAC-style rigorous analysis of the quality of the output of IncMine as a function of its parameters; this type of analysis is rare in pattern mining algorithms. As a by-product, the analysis shows how one of the user-provided parameters in the original description can be removed entirely while retaining the performance guarantees. Finally, we experimentally confirm on both synthetic and real data the excellent performance of the algorithm, as reported in the original paper, and its ability to handle concept drift.

    Methods for frequent pattern mining in data streams within the MOA system

    IncMine is a robust, efficient, practical, usable and extendable solution for frequent itemset mining over data streams. It is implemented within the Massive Online Analysis (MOA) framework. The work includes an analysis of its performance and of its reaction to synthetic and real concept drift.

    Toward building a content-based video recommendation system based on low-level features

    One of the challenges in video recommendation systems is the New Item problem, which occurs when the system is unable to recommend video items about which no information is available. For example, on popular movie-sharing websites such as YouTube, hundreds of millions of hours of video are uploaded every day, and a large portion of these videos may not contain any metadata that the system could use to generate recommendations. In this paper, we address this problem by proposing a method based on automatic analysis of the video content in order to extract a number of representative low-level visual features. Such features are then used to generate personalized content-based recommendations. Our evaluation shows that the proposed method can outperform the baselines by producing more relevant recommendations. Hence, a set of low-level features extracted automatically can be more descriptive and informative of the video content than a set of high-level expert-annotated features.

    Bootstrapping a Music Voice Assistant with Weak Supervision

    One of the first building blocks in creating a voice assistant is the task of tagging entities or attributes in user queries. This can be particularly challenging when entities number in the tens of millions, as is the case for, e.g., music catalogs. Training slot tagging models at an industrial scale requires large quantities of accurately labeled user queries, which are often hard and costly to gather. On the other hand, voice assistants typically collect plenty of unlabeled queries that often remain unexploited. This paper presents a weakly-supervised methodology to label large amounts of voice query logs, enhanced with a manual filtering step. Our experimental evaluations show that slot tagging models trained on weakly-supervised data outperform models trained on hand-annotated or synthetic data, at a lower cost. Further, manual filtering of weakly-supervised data leads to a very significant reduction in Sentence Error Rate, while allowing us to drastically reduce human curation efforts from weeks to hours, with respect to hand-annotation of queries. The method is applied to successfully bootstrap a slot tagging system for a major music streaming service that currently serves several tens of thousands of daily voice queries.

    Deriving Item Features Relevance from Past User Interactions

    Item-based recommender systems suggest products based on the similarities between items computed either from past user preferences (collaborative filtering) or from item content features (content-based filtering). Collaborative filtering has been proven to outperform content-based filtering in a variety of scenarios. However, in item cold-start, collaborative filtering cannot be used directly since past user interactions are not available for the newly added items. Hence, content-based filtering is usually the only viable option left. In this paper we propose a novel feature-based machine learning model that addresses the item cold-start problem by jointly exploiting item content features and past user preferences. The model learns the relevance of each content feature from the collaborative item similarity, thereby allowing collaborative knowledge to be embedded into a purely content-based algorithm. In our experiments, the proposed approach outperforms classical content-based filtering on an enriched version of the Netflix dataset, showing that collaborative knowledge can be effectively embedded into content-based approaches and exploited in item cold-start recommendation.